ABSTRACT

This is the final report of a series of 19
experiments designed to study impasses in the learning of skills with
a strong perceptual component. Several series of experiments were
designed with the purpose of producing experimentally manipulable
impasses or plateaus in the course of learning. Over 200 subjects
(humans and animals) in learning studies identified targets in
various complex computer-presented displays. Among the factors
manipulated were: (1) complexity; (2) noise; (3) salience; (4)
biasing instructions; and (5) the distribution of target features
across boundaries of displays. Impasses were produced, but patterns
of impasse phenomena were not reproduced reliably enough to support
or disconfirm a theory of impasses in learning. It is suggested that
the best available tools for studying impasses in learning are
probably the tools of comparative expertise (expert-novice) research,
rather than those of the learning study. Five figures supplement the
text. (Author/SLD)
Abstract

This is the final report of a series of experiments designed to study
impasses in the learning of skills with a strong perceptual component.
Several series of experiments were designed with the purpose of
producing experimentally manipulable impasses or plateaus in the
course of learning. Subjects in learning studies identified targets in
various complex computer-presented displays. Among the factors
manipulated were complexity, noise, salience, biasing instructions,
and the distribution of target features across boundaries of displays.
Impasses were produced, but patterns of impasse phenomena were not
reproduced reliably enough to support or disconfirm a theory of
impasses in learning.
The Concept of a Learning Impasse

This project was motivated by experiences in prior work on medical
expertise and its acquisition (Lesgold, 1984a,b; Lesgold, Rubinson et al.,
1988). We found that medical diagnostic performance showed certain
aspects of nonmonotone change with practice, and this led us to wonder
whether learning could be enhanced by finding ways to avoid apparent
plateaus and setbacks. The concept of learning plateaus has had a
checkered history in psychology (cf. Keller, 1958), but the discussions of
plateaus were very superficial, simply asserting that they resulted from poor
behavioral engineering and would not occur in any sensible instructional
setting. We felt that modern science and technology created many
circumstances in which plateaus might occur, and we wanted to gain some
explanatory and experimental control over the phenomenon.

Our experience with impasses in learning came from studies of
radiological expertise (Lesgold, 1984a,b; Lesgold, Rubinson et al., 1988) and
especially from learning studies that we conducted near the end of the
radiology studies. The first phenomenon we noticed occurred in studies
using an expert-novice type of comparative paradigm. We had no real
novices. Rather, we compared radiologists with five or more years of
post-residency experience with two groups of residents having either less than two
years of residency experience or more than two years. In those studies, we
found that the more advanced group of residents was less successful than
either the junior resident group or the senior staff group. While the numbers
of subjects were small, the effects were consistent. In several cases, junior
residents in one study were accidentally used later as senior residents in a
second study; on the same films, they reverted from correct diagnoses earlier
in their careers to incorrect diagnoses later.

We also conducted a number of training studies in which we taught
people over hundreds of trials to "diagnose" artificially generated displays that
were similar to chest x-ray pictures and based on a more-or-less accurate
anatomical model of the chest. In these unpublished studies, we varied the
amount of conceptual knowledge about the chest that was provided to
subjects, and we found that subjects taught an appropriate mental model for
the chest and its connection with the displays took as long or longer in
initial learning and showed no greater transfer to displays based on
variations in the chest "diseases" on which the original displays were based
(e.g., collapsed left upper lung instead of collapsed right middle lung) than
subjects who did not receive the conceptual training. Further, some display
types showed no learning over long periods of training (i.e., no movement
above chance performance).

After reading some of the literature on non-monotone aspects of
development and some of the concept learning literature, it became apparent
to us that certain aspects of modern life create opportunities to view the
world in ways that are more subject to learning impasses than might be the
case in a more "natural" world. Our view has been, in essence, that
impasses occur only in cases where (a) the situation to be understood or
recognized is extremely complex, (b) the structure of features apparent in the
situation does not map very directly onto any model of the world that the
learner might have, and (c) the learner has not yet acquired any direct
organization of the microfeatures of the situation into higher-order features
that might have such a direct mapping into his/her conceptual model
repertoire.

One example of such a situation is passive sonar image interpretation.
Passive sonar images are distributions showing energy levels of different
sound frequencies over time. The "objects" in such displays do not map
directly onto the objects of the ocean environment. Rather, they map onto
summations of sound producing activities. Further, each sound producing
activity is likely to produce several unique "objects" in a distribution of
spectral energy over time, and individual components of such "objects" may
be closer to components of other "objects" than to each other. Accordingly,
the potentially meaningful units according to the Gestalt rules may not be
meaningful at all. Such situations seem likely to be artificial, based on
some man-made artifices, rather than naturally occurring. They are not
entirely novel, but they are certainly more common with new technologies.
Other situations of this sort include 12-lead electrocardiograms, well logs
from oil exploration studies, and densely packed printed circuit and VLSI
layouts.

We hoped to bring the impasse phenomena produced by such
situations under experimental control, and that was the purpose of this
project. We were not entirely successful. Indeed, we asked ONR not to
consider the optional third year for our contract, because we feel that
significant progress must await the development of entirely different
experimental approaches than those we took. After performing 19
experiments, we still find ourselves unable to demonstrate and control
impasse phenomena adequately to meet our standards of empirical science.
In the sections that follow, we summarize theoretical viewpoints of possible
relevance, our many empirical studies, and our final conclusions.

Theoretical Views of Impasses

There are several levels at which one can view learning impasses.
Clearly, they can be seen at the cognitive level hinted at in the discussion
above, either fully within a theoretical stance based on mental models or
from a developmental point of view. However, they might also be seen from
a behavioral point of view or from a perceptual learning point of view, and
certain aspects of these non-cognitive viewpoints seem worthy of note.

The Behavioral View 

The conditioning literature contains references to certain cases in
which stimulus patterns either are not conditionable to responses or else
take a long time to become conditioned. Two related phenomena that have
been reported are overshadowing and blocking (cf. Mackintosh, 1975). Both
refer to situations in which one stimulus which is correlated with another
cannot be conditioned to a response. Overshadowing is a phenomenon
originally reported by Pavlov, in which a more salient stimulus, when
conditioned to a response, prevents the conditioning of a less salient but
equally relevant (i.e., predictive) stimulus to that response. For example, if a
weak thermal stimulus is presented shortly before food is supplied, a dog will
learn to salivate in response to that stimulus. However, if the thermal
stimulus is always accompanied by a loud noise, only the noise will be
conditioned.

Blocking is a term introduced by Kamin (1969) for cases in which conditioning
one stimulus to a response prevents later conditioning of a second element
after both are presented together. For example, if light is used to signal a
shock and then later light and noise together signal the coming shock, the
noise alone will not come to elicit any shock-related response. This
phenomenon is similar to one seen in some of our experiments on voice
spectrogram recognition described below.
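
The logic of the blocking design can be sketched with a small simulation. The sketch below uses the Rescorla-Wagner learning rule, which this report does not discuss; it is swapped in here only as a familiar textbook model that happens to produce blocking. The learning rate, trial counts, cue names, and function name are all illustrative assumptions.

```python
def rescorla_wagner(phases, rate=0.2, asymptote=1.0):
    """Return associative strengths after training (illustrative sketch).

    phases: list of (cues_present, n_trials) pairs; every trial is
    assumed to be reinforced (e.g., followed by shock).
    """
    strength = {}
    for cues, n_trials in phases:
        for cue in cues:
            strength.setdefault(cue, 0.0)
        for _ in range(n_trials):
            # The prediction error is shared by all cues present on the trial.
            error = asymptote - sum(strength[c] for c in cues)
            for c in cues:
                strength[c] += rate * error
    return strength


# Blocking group: light alone first, then light and noise together.
blocked = rescorla_wagner([(("light",), 50), (("light", "noise"), 50)])
# Control group: light and noise together from the start.
control = rescorla_wagner([(("light", "noise"), 50)])
```

Under these assumptions the pretrained light absorbs nearly all of the available associative strength, so the noise ends near zero in the blocking group but near half of the asymptote in the control group.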

Mackintosh (1975) suggested that a stimulus will be conditioned to the
extent that it signals a change from what could have been predicted without
it. Further, he theorized, stimuli that have no marginal predictive power
become less conditionable. To the extent that a stimulus's predictive power
is, or appears to the subject to be, stochastic, a change in predictive power
will take time to notice. Hence, if Mackintosh is correct, a stimulus without
predictive power that becomes predictive will initially suffer a period of slow
learning because of the compounding of the partial reinforcement effect and
the initially lower learning rate due to historically being low in marginal
predictive capability.



The Feature Sampling View 

The behavioral data just reviewed may seem of minimal relevance to
impasses in cognitive learning, but they do prompt us to notice several
aspects of the impasse situations we have examined and to better
understand how those situations deviate from experimental paradigms that
have been employed in studying plateaus and impasses. Concept learning
experiments tend to use relatively simple displays. The most common type of
experiment uses displays in which there are a small number of dimensions
varied, each involving a small number of display features, e.g., single vs.
double borders, square vs. triangle, one vs. two central forms, red vs. blue,
etc. A second type of display form that has been used in experimental work
is the random deviation from a prototype. The so-called Attneave figure is
such a form. To define each prototype, a set of randomly plotted points is
connected to create a polygon. Instances of the prototype are created by
introducing small random perturbations of the exact locations of the vertex
points. Three instances of the same prototype are shown in Figure 1 below.
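
The construction just described can be sketched in a few lines. This is a minimal illustration, not the software used in the project; the number of vertices, the display size, the jitter range, and the function names are all assumptions made for the example.

```python
import random


def make_prototype(n_vertices=8, size=100.0, rng=random):
    """Plot random points; connecting them in order gives a polygon."""
    return [(rng.uniform(0, size), rng.uniform(0, size))
            for _ in range(n_vertices)]


def make_instance(prototype, jitter=3.0, rng=random):
    """Perturb each vertex of the prototype by a small random offset."""
    return [(x + rng.uniform(-jitter, jitter),
             y + rng.uniform(-jitter, jitter))
            for x, y in prototype]


rng = random.Random(0)  # fixed seed so the example is repeatable
prototype = make_prototype(rng=rng)
instances = [make_instance(prototype, rng=rng) for _ in range(3)]
```

Each instance keeps the prototype's vertex order, so the three polygons share an overall shape while differing in the exact vertex locations.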

Figure 1. Sample Attneave Figure Variations from a Prototype.

Attneave figures and the simple displays of concept learning
experiments can be contrasted with the much more complex displays that
were the target of this project, passive sonar displays, voice spectrograms,
and the like. In the figures that have been used for experimental work, the
features that might play a role in defining categories are relatively evident.
In contrast, the meaningful features of the noisy artificial displays in which
we were interested are very difficult to isolate. Sometimes, critical features or
feature relationships are never noticed over the course of several hours of
experimentation. In this respect, standard methodologies of concept learning,
which look at the relative speed at which different kinds of concepts are
acquired, and perceptual learning experiments which look at the relative
speed at which different display types come to be recognized, were not suited
to our goals. As will be seen below, when we used realistic stimuli, many
subjects failed ever to learn what to notice. When we used simpler stimuli,
we failed to get impasse effects.

The time needed to discover which features are relevant in a
perceptual recognition learning task is an important measure. For example,
Zeaman and House (1963; see also Fisher & Zeaman, 1973) found that
retardates differed from normal subjects in how long it took them to notice
relevant stimulus features. Once features were noticed by retardates, their
improvement curves looked about the same as those for normal subjects.
This motivates an experimental paradigm in which trials until learning starts
to be evident is a basic measure. However, with the materials in which we
were interested, such experiments proved impossible to run successfully. In
order to be practical and yet of sufficient power, the experiments required
within-subject manipulations. However, when learning failed to occur at all
for some cases, these within-subject studies were not entirely conclusive.

The difficulty problem makes it impossible to clearly separate two
important potential causes of perceptual learning impasses. One is inability
to notice critical features, as just discussed. A second, and one that we
think is important (see the discussions below of our artificial voice
spectrogram studies) is whether critical feature combinations consist of
features that are all within the same meaningful region of a display or not.
As a specific example, consider the case of voice spectrograms for syllables.
In such displays, it is possible, and obviously meaningful, to parse the
display into segments corresponding to individual phonemes. The display
plots time on the x axis against frequency on the y axis, and it makes sense
to split up the total time into the periods in which each of the phonemes of a
syllable were uttered. However, since it also takes time for the speech
apparatus to reconfigure from one phoneme to the next, some of the cues for
identifying one phoneme are to be found in the features of the phoneme
immediately before or after. For example, distinguishing /d/ from /g/ is
generally difficult to impossible without examination of the features of the
vowel that follows (as in dig vs. gig).

This is an example of the general problem, cited above, in which the
apparent spatial components of a display do not map well onto the
components of the events that gave rise to the display. Unfortunately, we
failed to gain control over this kind of situation. While some of our final
experiments demonstrate weakly that such a problem is significant, we could
not control its emergence well enough to permit the kinds of instructional
studies we wanted to carry out. This outcome is particularly discouraging
because better theoretical apparatus is being developed for understanding
how people come to discover the feature clusters that are relevant to a
learning task. For example, Billman and Heit (1988) have simulated the
effects of some very general, or weak, metacognitive methods of focused
sampling of potential rules for mapping features and feature combinations
onto categories, a significant step beyond the simple formulations of Zeaman
(Fisher & Zeaman, 1973; Zeaman & House, 1963).
The Developmental View

The developmental literature also provides quite a bit of theoretical
power for dealing with learning impasses. Again, the problem is that we
could not gain adequate experimental control to apply current theory. Stage
theories of cognitive development are inherently theories of impasse, asserting
that certain learning, possible at later stages of development, cannot occur
earlier. In fact, the developmental literature is replete with examples of
non-monotone learning curves, situations in which performance suffers setbacks,
in terms of some fixed criterion, over the course of practice (Bowerman,
1982; Karmiloff-Smith, 1979; Karmiloff-Smith & Inhelder, 1974/1975; Klahr,
1982; Richards & Siegler, 1982; Stavy, Strauss, Orpaz, & Carmi, 1982;
Strauss & Stavy, 1982). In fact, Strauss & Stavy (1982) listed five kinds of
nonmonotone performance possibilities:

1. Movement from a practiced but inadequate mental
representation of a task situation to a more powerful but
less-well-practiced representation.

2. Uncoordinated combination of two different mental
representation systems.

3. Using newly-learned rules that are correct for one
situation in apparently related situations for which they are
incorrect.

4. Having lower-order rules to deal with each of two task
variables but not having the higher-order rules to coordinate
these lower-order rules.

5. Having problems adapting a newly-acquired weak
method to a specific situation for which a more domain-specific
strong method must be evolved before the new metacognitive
knowledge can be effective.

We believe that the problems faced by people trying to learn to
recognize displays like passive sonar images and voice spectrograms do
indeed involve mental representation inadequacies, but they are perhaps of a
slightly different character than has been examined in the developmental
literature. The problem appears to be that in order to quickly apprehend
these artificial displays, one must be able to recognize complex features that
are not physically clustered according to the Gestalt laws (e.g., the features
close together may not be related and ones far apart might be closely
related). Generally, in order to handle such situations, one needs to be able
to recognize the relevant lower-order features, to know parsing rules for
sorting out which lower-order features cluster together, and to understand
the meaning of the clusters.

This is not something that people are good at, in general. After all,
the case of speech perception is remarkably similar. The superficial
clustering, in terms of bursts of sound, for spoken language does not match
word boundaries very well (e.g., goo/d eve/ning or a/llon/s en/fants de/ la
pa/tri/e). Rather, we become highly practiced at matching these sound
patterns to representations of the concepts to which they refer, even though
that requires a highly specialized parsing. This parsing ability does not arise
without extensive practice. Even moving from one language to another
requires substantial practice. Further, in the speech understanding case,
our own experience tells us that the study of vocabulary and grammar do
not, themselves, permit understanding of the spoken word; one has to
practice conversations extensively to learn to understand a new language as
spoken. Prior reading knowledge certainly helps, but only to a point.

The time course of such practice makes it very difficult to conduct
learning studies. As a result, much of developmental psychology involves
comparisons of performance of different people selected from different points
in the learning/development curve. Further, extensive interactions and
verbal thinking-aloud protocols are often used. This is sufficient for
characterizing the course of development, but it does not admit readily the
possibility of studying systematically varied experience tracks. Small
amounts of comparative ethnographic work have been done, but for the most
part developmental methods are insufficient for studying the effects of
various training interventions.

Nonetheless, we had hoped to use such methodologies as an adjunct
to our experimental manipulations. Indeed, in some of the studies reported
below, we did take protocols in order to better understand how subjects were
trying to learn to recognize various patterns. However, our failure to
predictably generate impasse effects in experimentally tractable ways kept us
from pursuing the developmental approach very far. We did, however, get
some sense in a few of our studies of the ways in which subjects were trying
to sort out what they were seeing and therefore of the mental models that
they had for the domains we used.

Summary of Experimental Efforts

Since the fall of 1986, a total of 19 experiments were designed in
which at least one subject was run. Because the experiments used displays
generated by complex rules, all of the experiments were conducted on Xerox
artificial intelligence workstations. The programs used to generate the
displays and to conduct the experiments are available from the authors and
will be sent without charge to anyone on the ONR Cognitive Science mailing
list who requests them. The following is a summary of these experiments
and their results. Individual reports of the experiments give more detailed
descriptions of the experiments (see "Available Software and Data").

Our first attempts to produce reliable and experimentally tractable
impasses used extremely noisy displays of known object form classes, such
as animals and airplanes. We chose these displays in the hope that this
would allow us to keep the tasks simple enough to fit standard experimental
paradigms and time constraints. We then tried using displays that
resembled the segmented digits used on LCD watches. Finally, we conducted
an extensive series of studies using artificially created displays that
resembled voice spectrograms.

Lost Plane Experiments: September 1986 - December 1986

Two experiments were conducted in which subjects studied three
different drawings of military planes and then were given a series of visual
search trials in which they were to identify the plane that appeared on the
screen and its directional orientation (the latter a control for guessing). The
planes were obscured by a moderate amount of random line noise (lines or
curves of random length and orientation) and randomly strewn plane parts
(wings or tails). The two versions of the experiment, called Easy Planes and
Hard Planes, differed only in the amount of random line noise used. Figures
2 and 3 show examples of an easy and a hard case.

Figure 2. Easy plane facing east with wing noise.

Method. There were three different plane silhouettes, and the task
was to learn to identify which plane was hidden in the display. The
manipulated variables for the experiments were the Plane Identity (A, B, or
C), the Orientation of the plane (8 compass values), and the type of Plane
Parts used as masking noise (either wings from Plane A, or tails from Plane
C). Combinations of these variables produced 48 different pictures which
were presented to the subject in 4 blocks of 12 trials. Twenty subjects
participated in the Easy Planes experiment, and six participated in the Hard
Planes experiment.
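
The stimulus set described above is a full crossing of the three variables, which can be enumerated as follows. This is only a sketch: the orientation labels, mask labels, and the seeded random assignment of pictures to blocks are assumptions made for the example, since the report does not specify how pictures were ordered within or across blocks.

```python
import random
from itertools import product

PLANE_IDENTITIES = ["A", "B", "C"]
ORIENTATIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]  # 8 compass values
PARTS_MASKS = ["wings_of_A", "tails_of_C"]  # hypothetical labels

# Every combination of identity, orientation, and parts mask is one picture.
pictures = list(product(PLANE_IDENTITIES, ORIENTATIONS, PARTS_MASKS))

# Assumed ordering: shuffle once with a fixed seed, then cut into blocks.
random.Random(0).shuffle(pictures)
BLOCK_SIZE = 12
blocks = [pictures[i:i + BLOCK_SIZE]
          for i in range(0, len(pictures), BLOCK_SIZE)]
```

The crossing gives 3 x 8 x 2 = 48 pictures, which divide evenly into 4 blocks of 12 trials.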
Figure 3. Hard plane facing northeast with tail noise.

Results. Because our focus was on reliably generating learning
impasses, we could not fully control all variables. Specifically, the design of
the experiments unsystematically confounded Orientation with Learning
Block. Hence, a full factorial analysis could not be performed. This should
be kept in mind when considering the following results. For the Easy Planes
experiment, mean proportion correct over learning blocks increased linearly
from 0.55 to 0.92 while response time decreased linearly from 33.82 seconds
to 16.43 seconds. There were no systematic learning differences for the
different Plane Identities or Parts Masks. For the Hard Planes experiment,
mean proportion correct increased linearly from 0.44 to 0.79 over learning
blocks as response time decreased from 55.27 to 41.45 seconds. Again no
systematic learning differences were observed for either Plane Identity or
Parts Mask type. No learning impasses were observed.

Lost Animal Experiments: November 1986 - October 1987

The lost animals experiments were similar in principle to the lost
planes experiments. Generally, subjects were shown outline drawings of five
animals to study, and were then presented with several visual search trials
where they were to identify an animal and specify its orientation. Altogether,
seven lost animals experiments were conducted. These included
manipulations of noise type (Easy Animals and Hard Animals), manipulation
of the subject's advance knowledge of the animal shapes and identities (Free
Response Animals), extended practice on the difficult animals task by the
experimenters (Extended Animals and Nanimals), and comparison of learning
ability with parts masks which were inward projecting, where the parts could
belong to animals within the picture, or outward projecting, where the parts
could not belong to animals within the picture (Reversed Animals and
Within Animals).



Easy Animals and Hard Animals Experiments

The Easy and Hard Animals experiments were basically the same in
design as the Lost Plane experiments. Subjects viewed five outline drawings
of animals and then performed a visual search task where they specified
which animal was depicted and which orientation it faced. In the Easy
Animals experiment the animals were shown with one of two types of
random line noise: either straight lines or curved lines. In the Hard Animals
experiment the random line noise was augmented with a mask made up of
animal parts (e.g., kangaroo tail, elephant trunk, etc.).

Method. The manipulated variables were Animal Identity (Penguin,
Camel, Rhinoceros, Kangaroo, Elephant), Orientation (four primary compass
values), and Noise Type (straight or curved). Combinations of these variables
produced 40 different pictures which were shown to subjects in blocks of 10
trials. Sixteen subjects participated in each of the Easy and Hard Animals
experiments, but no subject participated in both experiments.
Results. As was the case for the Lost Planes experiments, the Lost
Animals experiments also unsystematically confounded Orientation with
Learning Block. Hence, no full factorial analysis was possible. Keeping this
in mind, the mean proportion correct for the Easy Animals experiment
increased slightly with learning block. The values range from 0.80 to 0.89.
At the same time, response time decreased from 15.76 seconds to 9.66
seconds with learning block. So, again there were no reliable impasse
effects. No systematic learning differences between animals were found, but
animals disguised in straight line noise were more often detected than
animals disguised in curved noise. Straight line noise accuracy was at
ceiling on all four learning blocks, but curved line noise accuracy appeared
to improve from 0.67 to 0.84.

The results for the Hard Animals experiment were that subjects 
performed only slightly above chance during the experiment and never 
improved (0.10 on block 1 to 0.11 on block 4: chance was 0.05). Subjects 
were only slightly more accurate on animals masked by straight line noise
(0.13) than on animals masked by curved line noise (0.09). It was this 
finding of an apparent impasse that kept us persisting with the animal 
detection studies. 
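
The stated chance level can be checked with a short calculation. The sketch below assumes, as one reading consistent with the reported value of 0.05, that a response counted as correct only when both the animal and its orientation were right and that all combinations were equally likely guesses; the function name is hypothetical.

```python
from fractions import Fraction


def joint_chance(n_identities, n_orientations):
    """Probability of guessing both identity and orientation correctly."""
    return Fraction(1, n_identities * n_orientations)


# Five animals and four orientations, as in the Method section above.
hard_animals_chance = joint_chance(5, 4)

# Reported block 4 accuracy (0.11) expressed as a multiple of chance.
block4_ratio = 0.11 / float(hard_animals_chance)
```

On this reading, chance is 1/20 = 0.05, and the block 4 accuracy of 0.11 is roughly twice chance.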



Extended Practice Animals and Nanimals Experiments

To discover whether the Hard Animals task could be learned, the
experimenters performed the task over several sessions. In the Extended
Practice experiment, two experimenters (MM and GG) familiar with the task
performed it 8 times. In the Nanimals experiment, an experimenter (JT)
unfamiliar with the task performed it 20 times. In this latter experiment,
different parts masks were used on each trial to prevent improvement due to
learning the position of the distractors.

Method. The experiment was the standard Hard Animals experiment
described above. For the Nanimals experiment, the animal parts mask was
changed on each problem to prevent the position of the distractors from
being learned. However, the same set of masks was used on each session.



Lesgold, University of Pittsburgh
Final Report: N00014-86-K-0361

Results. Again, no factorial analysis of the results will be presented,
but overall improvement in accuracy and response time was found. That is,
given adequate practice, learning occurred continuously without impasse.
For the Extended Practice experiment, one subject (GG) began with ceiling
accuracy and decreased in response time from a mean of 27.72 seconds on
the first block of the first session to a mean of 4.83 seconds on the final
block of the 8th session. The other subject (MM) reached ceiling accuracy on
the second session and decreased in response time from a mean of 67.42
seconds on the first block of the second session to 11.91 seconds on the last
block of the 8th session.

For the Nanimals experiment, the subject achieved an accuracy of
0.10 on the first session (comparable to the performance of subjects in the
Hard Animals experiment) and reached ceiling accuracy by about the 7th
session. From this point, response time decreased from 26.34 seconds on
the first block of the 7th session to 9.80 seconds on the final block of the
20th session. Again, the basic finding is that the task, too difficult for the
time constraints of ordinary laboratory experimentation, showed no real
impasses when adequate training time was given.



Reversed Animals and Within Animals Experiments

Even though continuous learning took place if enough trials were
given, the hard animals tasks could, on the right time scale, be seen as
involving impasses in learning, at least for the less-motivated subjects we
recruited (relative to our own staff in the extended studies). So, we tried to
find controlled means for making the difficulty of the hard animals conditions
come and go. These experiments examined whether the search difficulty
created by the animal parts mask (as was found in the Hard Animals
experiment) was due to subjects being misled into examining the parts
contained in the mask. The parts mask used by the Hard Animals
experiment located animal parts so that if the rest of the animal were
attached to the part, the whole animal would appear within the stimulus
picture. For this reason, the mask was called "inward projecting." A second
mask was designed which located the same parts so that if the rest of the
animal were attached to the part, most of the animal would be located
outside of the stimulus picture. This second mask was called "outward
projecting." The reasoning behind the experiments was that if subjects were
testing part hypotheses during their search, they should be more disrupted
by the inward projecting mask, whose parts they would have to test, than by
the outward projecting mask, whose parts they should be able to quickly
reject as potential targets. The two experiments differ in that the Reversed
Animals experiment uses a between-subject design while the Within Animals
experiment uses a within-subject design.

Method. For the Reversed Animals experiment, eight subjects were
run in the standard Hard Animals experiment (to establish continuity with
the previous experiment for this subject group) which used the inward
projecting mask. Sixteen subjects were run in the same task except that the
outward projecting mask was used in place of the inward projecting one. For
the Within Animals experiment, the straight and curved line noise masks
were replaced with a single mask which combined half straight and half
curved noise. Subjects then saw all of the animal patterns once with the
inward projecting mask and once with the outward projecting mask.

Results. The results of the Reversed Animals experiment were that the
subjects who searched for animals in outward projecting parts noise
identified about twice as many animals as the original Hard Animals subjects
(0.24 vs 0.10), but about the same as the comparison group given the Hard
Animals task (0.23). Neither the inward nor outward projecting groups
improved over blocks. This suggested that whatever impasses we were
observing before were motivational and not cognitive.

The results of the Within Animals experiment were that subjects
responded faster to the outward projecting problems than to the inward
projecting ones (57 seconds vs 38 seconds), but the accuracy on the two
types of problems was the same (0.32 vs 0.38, respectively) and greater than
chance.



LCD Experiment: September 1987

The LCD experiment looked at transfer of learning in a diagnostic
reasoning task. The subjects were to diagnose a "fault" in a display
resembling an LCD numeral display. In each problem in this series, a
simulated fault caused one or more segments of the seven-segment display
either to be always on, always off, or reversed: off when it should be on and
on when it should be off. The subjects, by calling for the display of digits
from 0 to 9, were to determine which segment(s) were affected and by which
fault. Two transfer conditions and one control condition were used to
determine whether learning on a simpler version of the task would
produce negative transfer to a more complex version.
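
The fault types described above can be illustrated with a minimal sketch.
It is not the software used in the study. It assumes the conventional a-g
labeling of the seven segments, and the names DIGITS, show, and symptoms
are hypothetical.

```python
# Conventional seven-segment encoding (an assumption, not from the report):
# the segments a-g that are lit for each digit.
DIGITS = {
    0: "abcdef", 1: "bc", 2: "abdeg", 3: "abcdg", 4: "bcfg",
    5: "acdfg", 6: "acdefg", 7: "abc", 8: "abcdefg", 9: "abcdfg",
}


def show(digit, faults):
    """Return the set of lit segments for `digit` under the given faults.

    `faults` maps a segment letter to "on" (always on), "off" (always
    off), or "reversed" (off when it should be on, and vice versa).
    """
    lit = set(DIGITS[digit])
    for segment, kind in faults.items():
        if kind == "on":
            lit.add(segment)
        elif kind == "off":
            lit.discard(segment)
        elif kind == "reversed":
            lit ^= {segment}  # toggle the segment
        else:
            raise ValueError(f"unknown fault type: {kind}")
    return lit


def symptoms(faults):
    """For each digit, list the segments that differ from a normal display."""
    return {d: sorted(show(d, faults) ^ set(DIGITS[d])) for d in DIGITS}
```

Under this encoding a reversed segment produces a visible discrepancy on
every digit, whereas an always-on or always-off segment shows up only on
the digits that normally leave that segment dark or lit, respectively.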

Method. Fifteen subjects were divided into three conditions. All
subjects participated in two experimental sessions. In the first condition,
subjects performed a simple version of the task on the first session and then
transferred to the full task on the second session. The simple version used
problems which had only one affected segment, which was either always on
or always off. In the full version of the task, problems could have either one
or two affected segments and could be reversed, always on, or always off. In
the second condition, subjects performed a task which was more complex
than the simple task, but less complex than the full task, before transferring
to the full task. In this moderately complex task, problems had only one
affected segment, but it could be always on, always off, or reversed. On their
second session, these subjects performed the full task. Finally, the third
condition received the full task on both sessions. The dependent variable
was the proportion of correct responses (both segment and disease correct).

Results. Difference scores between proportion correct on first and
second sessions were calculated for each subject. The mean values were
-0.108 for the first condition, 0.010 for the second condition, and 0.030 for
the third condition. Bonferroni t-tests revealed that subjects who
experienced the simple version of the task in the first session showed
significant negative transfer relative to those who experienced the full task
(p < .05) but that those experiencing the moderately complex task in the first
session did not show significantly more negative transfer (p > .05).

Spectrogram Learning Experiments: November 1987 - June 1989

We shared with the ONR technical monitor the belief that the LCD
studies were not as interesting a direction to pursue as the more perceptual
possibilities we were considering and therefore ceased experimentation in this
line. The remainder of our studies used artificially produced voice
spectrograms, displays in which time was plotted on the x axis and
frequency on the y axis, with darkness of a position showing the amount of
sound energy of that frequency present at that time. Figure 4 shows an
example of the type of display that we used.

Nine experiments were run using pseudo-speech spectrograms as
stimuli. The first studies used a scaling methodology to try to determine
which visual dimensions of vowel patterns naive subjects would attend to
(Vowel Scaling experiment and Scale-Learn-Scale experiment). This was
followed by experiments which looked at the learning of vowel patterns
(Vowel Transfer experiment), real word patterns (Real Word Learning
experiment), and finally consonant patterns (Consonant Discrimination
experiments I, II, and III). A small experiment was also performed which
tried to examine the influence of subjects' conceptual understanding of
speech on their spectrogram reading performance (Instructional Model
experiment).

Figure 4. Example of artificial speech spectrogram.

To understand the logic of the experiments, a few facts about speech
spectrograms are worth noting. There are two types of phonemes, vowels
and consonants. Vowels consist primarily of sound energy clustered into
three main frequency bands, and these bands stay at about the same
frequency for a relatively long time. Consonants, on the other hand, tend to
involve faster changes in frequency and somewhat less clustering around a
small number of core frequencies, called formants. This substantial
difference in appearance makes it highly likely that even a naive viewer will
parse a spectrogram display into regions demarcated by phoneme
boundaries. Critically important to our design is the fact that some
consonants are indistinguishable from one another if one looks only at the
part of the spectrogram associated with the temporal duration of the
consonant. Rather, these consonants must be distinguished by examining
the effects of the lip and mouth movements they involve on either preceding
or following vowels. In particular, /d/ and /g/ are distinguished by their
effects on the vowel which follows them, either "pulling" the start of the
second and third formants together to the point of overlap or not.

This has two effects. First, vowel displays vary depending on the
consonant context in which they appear. However, there are certain aspects
to vowel displays that are constant. These become the critical features for
identifying vowels. For identifying consonants, on the other hand, one must
consider not only the part of the display showing the consonant's acoustic
effect but also the neighboring vowel. Further, what is noise with respect to
vowel identification is critical to neighboring consonant identification. So,
identifying certain consonants like /d/ and /g/ requires noticing that part of
the neighboring vowel context is relevant and, in particular, that the relevant
part is the part that is more or less irrelevant to vowel identification.

We expected that impasses would occur whenever perceptual learning
tasks involved distinguishing syllables that differed in whether they began
with /d/ or /g/, because the needed information for deciding on the
distinction was spread over two different regions of the display and because
the vowel context information needed was the "noise" with respect to vowel
identification. The series of studies we conducted included some in which we
tried to gather baseline data on feature salience and others in which we
looked directly for the impasse effect.



Vowel Scaling Experiment and Scale-Learn-Scale Experiment

The scaling experiments were, in essence, baseline studies. A
computer program was written to generate pseudo-speech spectrogram
patterns based on feature descriptions of real spectrograms. The first
patterns generated were vowels in a standard form (no distorting consonant
context, horizontal formants) and in a transformed form (curved formants as
would result from consonants immediately before or after). To compare how
similarly subjects would regard the transformed and the standard vowel
formants, two scaling studies were done. In the first, subjects saw all
pairwise combinations of 11 vowels in standard and transformed form and
rated the similarity of each pair on a numerical scale. These values were
entered into a multidimensional scaling analysis. In the second experiment,
a different group of subjects made similarity judgments on the 11 standard
vowel patterns, then learned to distinguish the patterns, and finally, scaled
the patterns again. This was done to see whether learning would change
how subjects saw the patterns.

Method. In the first scaling experiment subjects scaled all pairwise
combinations of 22 patterns (11 standard and 11 transformed for a total of
231 pairs). Each pair appeared on a computer screen along with a scale
ranging from 1 (not similar) to 7 (very similar). Nineteen subjects rated the
similarity of the 231 pairs.

In the second experiment, five subjects rated the similarity of 55 pairs
of vowels (pairwise combinations of the 11 standard vowels), then learned to
identify the different vowels, and finally rated them again. The rating
procedure was the same as in the Vowel Scaling experiment. The learning
procedure had subjects view the 11 vowels in a random order and select the
name of the vowel from a screen menu. If the response was incorrect, the
subject was given the correct name. The measure of learning was the
number of times the subject had to go through the list before getting them
all right.
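
The pair counts quoted in the Method follow from counting unordered pairs
of distinct patterns. A minimal sketch of that arithmetic is given below;
the pattern labels and the helper name n_pairs are hypothetical and serve
only to illustrate the count.

```python
from itertools import combinations
from math import comb


def n_pairs(n_patterns):
    """Number of unordered pairs of distinct patterns."""
    return comb(n_patterns, 2)


# Hypothetical labels: 11 standard and 11 transformed vowel patterns.
patterns = [f"{form}_{i}"
            for form in ("standard", "transformed")
            for i in range(1, 12)]

# Every pair a subject would rate in the first scaling experiment.
pairs = list(combinations(patterns, 2))
```

Here len(pairs) and n_pairs(22) both give the 231 pairs of the first
experiment, and n_pairs(11) gives the 55 pairs of the second.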



Results. The data were scaled using ALSCAL, a nonmetric
multidimensional scaling program, and INDSCAL, a related program that also
examines differences between individual subjects' data. For the simple
scaling experiment, the most meaningful ALSCAL solution was found with
three dimensions. However, the stress value of this solution was 0.267,
indicating that it was not a very good fit. Nevertheless, this solution tended
to separate the patterns according to whether they were standard or
transformed, whether they were low or high vowels (second formant height),
and whether the formants were transformed by a slight bending (such as
that which occurs when a vowel follows a bilabial stop) or by a convergence
of the second and third formants (such as that which occurs when a vowel
follows a velar stop).

For the Scale-Learn-Scale experiment, the scaling of the first rating
achieved a stress of 0.199 in three dimensions, but only two of those
dimensions, second formant height and vowel width, were readily
interpretable. An INDSCAL solution indicated that most of the subjects
weighted second formant height higher than both vowel width and the
uninterpreted third dimension. On the learning task, subjects took an
average of 16.4 attempts to learn the 11 vowels. After learning, the subjects
again rated the similarity of the vowels. On this second rating, their scaling
solution looked similar to the first one. The three dimensional solution
achieved a stress of 0.184 and again the recognizable dimensions were
second formant height and vowel width. An INDSCAL solution was found for
this second scaling and a comparison of the two revealed that most subjects
increased their weighting of second formant height and decreased their
weighting of vowel width. This indicates that learning may have sensitized
them to using the second formant as a basis for discrimination and thus
caused them to become less sensitive to the information that might help in
distinguishing a prior consonant like /d/ or /g/.

Vowel Transfer Experiment

One way people might be taught to recognize vowel patterns is by
training them on the standard vowel forms (which are never encountered
when "reading" spectrograms of continuous speech) and expecting this
training to transfer to the transformed cases the learner will encounter. It is
also reasonable to expect this might not work. If subjects attend to the
wrong aspects of the standard form, or don't recognize the transformed vowel
as an exemplar of the standard form, no transfer would be expected. The
Vowel Transfer experiment was designed to see whether this expectation was
reasonable. The experiment compared transfer from the standard vowel
patterns to the transformed vowel patterns with transfer in the opposite
direction.

Method. Eight subjects were divided into two groups of four. One
group was given the task of learning the standard vowels followed by the
task of learning the transformed vowels. The second group received the
same tasks but in the reverse order. The learning tasks were the same as
the one described in the Scale-Learn-Scale experiment. Subjects saw 11
vowels one at a time in random order and learned to identify them by
selecting their names from a screen menu. If subjects were wrong, they were
told which answer was correct. The learning criterion was one errorless pass
through the 11 vowels.

Results. Subjects in the first condition, who learned the standard
vowels first, took an average of 28 blocks to learn the first set of vowels and
an average of 7.25 blocks to learn the second. Subjects in the second
condition, who learned the transformed vowels first, took an average of 11.25
blocks to learn the first task, and also took an average of 11.25 blocks to
learn the second task. Learning to discriminate the transformed vowels was
easier than learning to discriminate the standard vowels, likely because the
transformed vowels are less similar to each other. However, learning the
transformed vowels first produced a savings of 16.75 blocks on learning the
standard vowels, while learning the standard vowels first only produced a
savings of 4 blocks on learning the transformed vowels.
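
The savings figures can be recomputed from the four group means reported
above. The sketch below is only that arithmetic: savings on a task is
taken to be the mean number of blocks needed by the group that learned it
first minus the mean for the group that learned it second. The dictionary
and function names are hypothetical.

```python
# Mean blocks to criterion, copied from the Results paragraph.
learned_first = {"standard": 28.0, "transformed": 11.25}
learned_second = {"standard": 11.25, "transformed": 7.25}


def savings(task):
    """Blocks saved on `task` when it is learned second rather than first."""
    return learned_first[task] - learned_second[task]
```

Because each task was learned first by one group and second by the other,
this is a between-group comparison rather than a within-subject one.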

Real Word Learning Experiment

The Real Word Learning experiment examined the learning of English
words made up of a stop consonant followed by a vowel followed by another
stop consonant. A pseudo-spectrogram pattern was displayed on the screen
and subjects were free to type in any word they chose as a response. The
computer was programmed to detect alternate spellings of the target word
and provided feedback when subjects made an error.

Method. Nine subjects were shown as many words as time permitted
in a two hour experiment session (at least 110 and as many as 160). One
subject's data was excluded because he was not a native English speaker.
The subjects were free to respond with whatever word they wished, but most
of them quickly learned the three letter nature of the patterns. The subjects'
performance was examined by looking at the total number of correct
phonemes in intervals of 10 trials.

Results. The general result was that the subjects showed quick initial
learning which appeared to level off at less than perfect performance.
Assuming subjects quickly learned the set of possible responses from the
feedback they were given (i.e., that there were only six possible consonants
and six possible vowels), two subjects showed chance performance with no
improvement. The remaining six subjects each showed either abrupt or
gradual initial improvement which reached a plateau between 50% and 75%
correct. Looking at how subjects performed on individual phonemes revealed
that /b/ and postvocalic /p/ were learned fairly quickly, followed by /d/,
/t/, and prevocalic /p/, but most subjects had difficulty learning to identify
/k/ and /g/. What these two patterns had in common was that they were
identical to another letter (/k/ was identical to /t/ and /g/ was identical to
/d/) except for their effect on the adjacent vowel. Most stops cause the
formants of an adjacent vowel to curve slightly down at the consonant-vowel
boundary, but the velar stops /k/ and /g/ cause the second and third
formants of the vowel to curve together and meet at the consonant-vowel
boundary. Subjects apparently had difficulty establishing that this difference
could signal the distinction between /d/ and /g/ or /t/ and /k/.

To establish that full learning would eventually occur on this task (i.e.,
that subjects were not at a permanent impasse), an additional subject was
run for a total of seven consecutive sessions (1\13 trials) and showed steady
initial improvement for the first two sessions which appeared to level off
during the third and fourth sessions before resuming and reaching ceiling
performance. This finding suggests that although learning appeared to
plateau early for the first group of subjects, it would likely resume
improving until it reached ceiling. This plateau appears to be due to the
difficulty of distinguishing the /d/ patterns from the /g/ patterns and the
/t/ patterns from the /k/ patterns. This finding inspired the Consonant
Discrimination Learning experiments which are described below.

Instructional Model Experiment

The purpose of this pilot experiment was to see if we could improve
subjects' ability to learn to read the real word spectrograms by giving them
information about how speech sounds are made and what components of the
speech signal are represented in the spectrogram pattern. We looked at two
types of knowledge: conceptual knowledge about how speech sounds are
made, and specific cue knowledge about which spectrogram features are
important for discriminating certain sounds.

Method. Thirty-two subjects were divided into four groups. These
groups were: Cue Alone, Model Alone, Separate Model and Cue, and
Integrated Model and Cue. The groups differed according to the verbal
instructions given to the subjects. In the Cue Alone condition, subjects were
shown a table which distinguished the six stop consonants and six vowels by
visual features of their spectral representation. These cues included striation
(voicing), width (duration), dark spots (formants), dark band height (place of
articulation), and dark band curving (coarticulation effects). The subjects
were told how they could use these cues to distinguish the consonants and
vowels. In the Model Alone condition, subjects were shown a table which
distinguished the consonants and vowels according to articulatory features
(listed in parentheses above), but verbal instructions did not relate these
features to any visual spectrogram features. In the Separate Model and Cue
condition, subjects received all of the information in the Model Alone and
Cue Alone conditions, but this information was not related together in the
verbal instructions. Finally, in the Integrated Model and Cue condition, all of
the model and cue information was given and tied together in the verbal
instructions. After receiving these instructions, subjects were given the Real
Word Learning experiment previously described. Subjects viewed a total of
74 words. Their performance on the first 10 words and the last 10 words
was measured. On the intervening problems, subjects had access to a help
window which displayed the tables they had seen during instruction. The
difference between their performance on the first 10 trials and the last 10
trials was used as a measure of their improvement.

Results. The mean number of phonemes correctly identified on the
first 10 problems over all subjects was 4.53. Because subjects knew that
there were only six possible responses for each of the three phonemes in a
pattern, chance performance on a block of 10 trials was 5.0 phonemes. A
t-test showed that this first block performance was not better than chance,
t(31) = 1.49, p > .05; and none of the means for the four instructional
conditions deviated significantly from the others (range was 4.12 to 5.0). The
mean number of phonemes correctly identified on the last 10 problems over
all subjects was 11.16. An analysis of variance was performed to compare
whether the difference in first and last block performance varied with
condition. The analysis found that although significant learning occurred
between the first and last block, F(1,24) = 40.96, p < .001, this improvement
was equal for all instructional conditions, F(3,24) = 0.98, p > 0.40.
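
The chance level of 5.0 phonemes follows from the response set described
above. A minimal sketch of that arithmetic, assuming each of the six
alternatives is guessed with equal probability (the function name and
parameter names are hypothetical):

```python
from fractions import Fraction


def chance_correct(n_trials, phonemes_per_trial=3, n_alternatives=6):
    """Expected number of phonemes correct by guessing over a block.

    Each trial offers `phonemes_per_trial` chances to be right, and each
    guess is correct with probability 1 / n_alternatives.
    """
    return Fraction(n_trials * phonemes_per_trial, n_alternatives)
```

For a block of 10 trials this gives 30 guesses at 1/6 each, or 5 phonemes,
out of a possible 30.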

One other measure of interest was the number of times subjects in
each condition used the help screen. The results showed that subjects in
the Model Alone condition used the help screen the least, an average of 4.75
times. Subjects in the Cue Alone and Integrated Model and Cue conditions
used the facility the same amount, an average of 8.78 and 8.75 times
respectively. The subjects in the Separate Model and Cue condition used the
help facility the most, an average of 10.38 times. These values may reflect
how useful the subjects in these conditions thought the help information
was, but this did not appear to affect their learning very much.

The conclusion of this study was that no instructional effect was found
for this task. The reasons are not clear, but it is likely that subjects did not
adequately learn the instructional material and could not make use of it
during practice. No effort was made to assess the extent of their learning of
the instructional material, so this explanation is unverified.

Consonant Discrimination Learning Experiment I

In the Real Word Learning experiment, it was observed that subjects
had more difficulty learning consonants which had to be distinguished by a
vowel feature (formant curvature). The first Consonant Discrimination
Learning experiment was undertaken to test whether this was a real effect, or
whether it was due to the unequal number of consonants in each of the
learning blocks. The basic design of this experiment was the same as the
Real Word Learning experiment, but subjects were given all C-V-C
combinations of the consonants and vowels, and they were not told of any
relationship between patterns and real words. Subjects responded by
selecting consonant and vowel names from a menu rather than typing in the
word. Feedback was provided on error trials.

Method. Ten subjects were shown pseudo-spectrogram patterns of all
CVC combinations of the consonants /b/, /p/, /d/, /g/, /t/, /k/ and
vowels /I/, /e/, /ae/, /P/, /u/, /o/. This produced 216 patterns, which
were shown over three to four sessions. The patterns were divided into
blocks of twelve, so that each consonant appeared in prevocalic and
postvocalic form twice, and each vowel appeared twice. The presentation of
these blocks and the order of patterns within a block was randomized.
Subjects were also questioned verbally about their hypotheses and intuitions
about the task. The stimuli were drawn so that /b/ and /p/ appeared
similar but could be distinguished by more than one feature (such as texture
and shading); /t/ and /k/ appeared similar but could be distinguished by a
single feature (number of dark spots inside their pattern); and /d/ and /g/
appeared identical but could be distinguished by the curving of the adjacent
vowel's formants (/g/ caused the formants to curve together). The block on
which subjects learned to distinguish each of these three pairs was the main
dependent variable.
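
The stimulus count in the Method can be checked by enumerating the
consonant-vowel-consonant combinations. The sketch below does only that.
The vowel labels are hypothetical placeholders, and the balanced
assignment of patterns to blocks described above is not reproduced.

```python
from itertools import product

CONSONANTS = ["b", "p", "d", "g", "t", "k"]
# Placeholder labels standing in for the six vowels (an assumption).
VOWELS = ["V1", "V2", "V3", "V4", "V5", "V6"]

# Every (initial consonant, vowel, final consonant) combination.
patterns = list(product(CONSONANTS, VOWELS, CONSONANTS))

BLOCK_SIZE = 12
n_blocks = len(patterns) // BLOCK_SIZE
```

Six initial consonants, six vowels, and six final consonants give
6 x 6 x 6 = 216 patterns, which divide evenly into 18 blocks of twelve.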

Results. Subjects were considered to have learned a pair if they
responded correctly on four consecutive blocks with only one error. Of the
10 subjects, 9 learned the /b/-/p/ distinction, 6 learned the /t/-/k/
distinction, and 2 learned the /d/-/g/ distinction. McNemar's exact test for
correlated proportions showed that significantly more people learned the
/b/-/p/ distinction than learned the /d/-/g/ distinction (p < .02), but the test of
whether more people learned the /t/-/k/ distinction than learned the
/d/-/g/ distinction was non-significant (p > .10). A matched pairs sign test was
used to test which distinctions were learned earlier than the others. This
test revealed that the /b/-/p/ and /t/-/k/ distinctions were learned earlier
than the /d/-/g/ distinction (p < .01 and p < .02 respectively).
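
The learning criterion stated above can be sketched as a sliding-window
check over a subject's per-block error counts. This is one reading of the
criterion, not the authors' scoring code: it assumes "only one error"
means at most one error in total across the four blocks, and it reports
the block on which the window is completed. The function name and
parameters are hypothetical.

```python
def block_learned(errors_per_block, window=4, max_errors=1):
    """Return the 1-based block on which the criterion is first met.

    `errors_per_block` lists the number of errors on the pair in each
    successive block. Returns None if the criterion is never reached.
    """
    for end in range(window, len(errors_per_block) + 1):
        if sum(errors_per_block[end - window:end]) <= max_errors:
            return end
    return None
```

For a hypothetical subject with error counts [2, 1, 0, 1, 0, 0, 0], the
first qualifying run is blocks 3 to 6, so the function returns 6.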

These results appear to have verified the previous finding. It was more
difficult to learn a discrimination if the critical feature was in another part
(in a vowel in this case). However, it is not certain whether this effect is due
to segmentation, the salience of the cues, or some other factor. The third
Consonant Discrimination Learning experiment followed up this question.

Consonant Discrimination Learning Experiment II

The next Consonant Discrimination Learning experiment looked at
whether the random noise added to the spectrogram patterns had any
influence on the difficulty of learning the patterns. Presumably, if people are
biased towards looking within a part for a feature which will identify it, then
the presence of random noise will supply more hypotheses for them to
consider than if the random noise were not present. The task in this
experiment was simplified by using only the /d/-/g/ and /t/-/k/ consonant
distinctions and only one consonant in each pattern. The presence of noise
(random edging) was varied between subjects.

Method. The patterns shown to subjects were all C-V combinations of
the consonants /d/, /g/, /t/, /k/ and the vowels /I/, /o/, /ae/, /e/. The
16 different patterns were shown 18 times for a total of 288 trials. In the
no-noise condition, these patterns appeared with straight edges; in the noise
condition, the lengths of the lines used to draw the pattern were set to a
random number within about 6 mm from a set ending point. For both
conditions, the problems were divided into blocks of four, where each
consonant and vowel appeared once. The subjects responded separately to
the consonant and vowel by selecting the symbol for each from a screen
menu. The major dependent variable was the block on which a subject
learned the /d/-/g/ and /t/-/k/ distinctions. Twelve subjects were run to
obtain 4 full or partial learners in each condition.
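
One way to satisfy the blocking constraints in the Method (each consonant
and each vowel once per block of four, each of the 16 patterns shown 18
times) is to cycle through the rows of a 4 x 4 Latin square. The sketch
below uses that construction. It is an illustration under that assumption,
not the procedure the authors report, and make_blocks is a hypothetical
name.

```python
import random
from collections import Counter

CONSONANTS = ["d", "g", "t", "k"]
VOWELS = ["I", "o", "ae", "e"]


def make_blocks(repetitions=18, seed=0):
    """Build blocks of four C-V patterns with each consonant and vowel once.

    Each cyclic shift of the vowel list pairs every consonant with a
    different vowel, so four shifts cover all 16 patterns exactly once.
    """
    rng = random.Random(seed)
    blocks = []
    for _ in range(repetitions):
        for shift in range(4):
            block = [(c, VOWELS[(i + shift) % 4])
                     for i, c in enumerate(CONSONANTS)]
            rng.shuffle(block)   # randomize order within the block
            blocks.append(block)
    rng.shuffle(blocks)          # randomize order of the blocks
    return blocks


blocks = make_blocks()
counts = Counter(trial for block in blocks for trial in block)
```

With 18 repetitions this yields 72 blocks and 288 trials, and counts shows
each of the 16 patterns appearing 18 times.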

Results. All four non-learners were in the noise condition. In the
no-noise condition, three of the subjects learned the /t/-/k/ distinction before
the /d/-/g/ distinction. In the noise condition, two subjects learned the
/t/-/k/ distinction first, and two learned the /d/-/g/ distinction first. Not
enough subjects were run to perform any statistical tests. The results do
appear to suggest that the addition of the random noise made the task
somewhat more difficult to learn.

Consonant Discrimination Learning Experiment (Selection Task)

Another question that occurred to us was whether the subjects learned
the /d/-/g/ distinction last simply because it was more difficult, or whether
they had to learn all of the other distinctions first to eliminate other features
from consideration. Would we still find this same learning order if subjects
could select which stimulus patterns they could see? To test this, we set up
an experiment in which a subject responded to one block of trials in the
same way as in the previous experiment, but then for the next block of trials
could select which patterns to see by selecting the appropriate phonemes.

Method . It was necessary to run only one subject on this mixed 
presentation/selection task. 

Results . The basic result is that the subject learned the /b/-/p/ 
distinction first, but then focused on the /d/-/g/ distinction and learned it 
before the /t/-/k/ distinction. 



Consonant Discrimination Learning Experiment III

The final Consonant Discrimination Learning experiment tried to
discover whether the learning difficulty associated with the vowel
transformation cue was attributable to segmentation or some other factor
such as salience. This experiment used a complex design to control for
salience and task demands, but used the same task as the Noise condition
in the second Consonant Discrimination experiment.

Method. To control for any differences in cue salience, each type of
cue, the formant curving cue (/d/-/g/ distinction) and the number of
formants cue (/t/-/k/ distinction), was presented both within the phoneme
being learned and outside of it in another part. Because this could not be
done using a within subjects design, an incomplete blocks design was used.
A pair of subjects provided one observation for both cues presented within
and outside of a part. Thus, any difference in salience between the two cues
should equally affect within and between object discriminations. To control
for any task demands which may be produced by associating different parts
of the pattern with different responses, subjects made a single consonant
response to the whole pattern and never made a separate response for
vowels. However, half of the subject pairs were given instructions biasing
them to look at either the consonant or vowel (whichever contained the
within object cue). Trials were divided into 8 problem blocks with each
consonant represented twice and each of four vowels represented once. The
block on which a subject learned one of the consonant distinctions was the
major dependent variable.

Results. Subjects were considered to have learned a consonant
distinction if they were correct on two consecutive blocks with one allowed
error on the second block. Eighteen of the subjects learned both the within
part distinction and the between part distinction, 13 learned only the within
part distinction, 5 learned only the between part distinction, and 12 learned
neither distinction and were not included in the analysis. Matched pairs
sign tests were performed to determine which distinctions were more difficult.
These tests revealed that the number of formants cue was more difficult to
learn than the formant curving cue when the cues were between parts, but
there was no difference between the two cues when they were within a part.
This indicates that segmentation interacts with cue salience to produce
learning difficulty.

However, the pattern of these results did not reproduce those reported
in the first Consonant Discrimination Learning experiment. This is most
likely due to the change in the task. Subjects in the previous experiment
responded to both consonants and vowels, but subjects in the present
experiment only made a consonant response to the whole pattern. Subjects
making the vowel response likely thought the formant curving was relevant to
vowel identity and failed to use it to distinguish the consonants. When the
necessity of making a vowel identification was removed, subjects could
consider any feature relevant to the consonant identity.

The results of this experiment indicate that subjects may be biased
towards searching within a part for its distinguishing features and that this
bias may be enhanced when other task demands make use of any between
part cues.



Conclusions 

The studies performed, and other pilot efforts with similar outcomes,
make it clear that a significantly different approach will be needed if progress
is to be made on impasses in perceptual learning. We did try other
approaches, including extensive taking of protocols and probing for
hypotheses about what characterized various displays. However, we were not
able to gain sufficient control over the generation of impasses to have them
occur reliably, for most of our subjects, and over multiple experiments. Yet
there were, along the way, striking examples of extended periods in which
little or no learning took place.

For example, in some of our studies that showed impasses, at least 
temporarily, we were able to fit individual subjects' data with models that 
claimed performance to be constant at one level until it rose, rather quickly, 
to a second level. This type of model is relatively consistent with the Zeaman 
and House (1963) representation of learning as consisting of a period in 
which there is a search for relevant features followed by rapid learning of the 
mappings of those features onto categories. Figure 5 shows the data for one 
student on Consonant Discrimination Learning Experiment I. The problem 
was not that we never got such nice impasse patterns; rather it was that we 
never gained control over when they would appear. Indeed, the same 
experiment yielded protocols supporting the difficulty subjects had in noticing 
feature clusters that crossed meaningful unit (phoneme) boundaries. 
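
To make the two-level description concrete, the sketch below fits an 
idealized step function to one subject's block-by-block accuracy by least 
squares. It is an illustration under stated assumptions, not the fitting 
procedure used in this project, which the report does not specify. The 
function name fit_step and the accuracy values are hypothetical, and the 
"rather quick" rise is simplified to an instantaneous jump. 

```python
def fit_step(scores):
    """Least-squares fit of a two-level step function to a score sequence.

    Tries every possible change point, models the scores before it by
    their mean and the scores from it onward by their mean, and keeps the
    split with the smallest summed squared error.
    Returns (change_index, first_level, second_level, sse), where
    change_index is the position of the first score at the second level.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to fit two levels")
    best = None
    for c in range(1, n):
        before, after = scores[:c], scores[c:]
        m1 = sum(before) / len(before)
        m2 = sum(after) / len(after)
        sse = (sum((x - m1) ** 2 for x in before)
               + sum((x - m2) ** 2 for x in after))
        if best is None or sse < best[3]:
            best = (c, m1, m2, sse)
    return best

# Hypothetical proportion correct per block. Not data from this report.
blocks = [0.50, 0.55, 0.45, 0.50, 0.50, 0.90, 0.95, 0.90, 0.95, 0.90]
change, first_level, second_level, sse = fit_step(blocks)
print(change, round(first_level, 2), round(second_level, 2))
```

A long first segment with a level near chance, followed by a much higher 
second level, corresponds to the impasse pattern described above. Comparing 
the fitted error with that of a gradual (for example, linear) fit would be 
one way to judge whether a step description is warranted for a given subject. 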

We conclude that the best available tools for studying impasses in 
learning are probably the tools used in comparative expertise ("expert-novice") 
research, rather than those of the learning study. That is, one must find 
natural situations in which impasses occur over periods of extended learning 
practice and carefully assess performance at benchmark points in the course 
of such apprenticeship. Independent of circumstances, the time one can 
have to work with a research subject is always limited, and for the present 
purpose, it should be invested in understanding a current state of knowledge 
rather than trying to induce a new state that may take too long to appear. 
In a sense, then, the original radiological expertise studies may have been 
closer to the right approach than the work undertaken in the present project. 

We did demonstrate impasses, though, and our views of why they 
occur and how they might be overcome still seem reasonable. Specifically, 
impasses arise when the relevant features of a situation are not apparent. 
Because feature noticing is extremely well developed in humans, this problem 
generally arises only when (a) the features defining a category are tied, by the 
Gestalt rules and prior knowledge of the environment, more closely to 
features relevant to other domain tasks than to each other; (b) a mental 
model of how the displays come to look the way they do has not been 
acquired or is not mentally manipulable with facility; and (c) no advice (rules) 
on how to parse the display has been acquired. Some of the displays that 
arise in modern technological applications have these characteristics. Further, 
because the display forms are designed by experts, no one may notice that 
they have the shortcomings just mentioned. 



Available Software and Data 



Longer reports of each of the experiments described above, including 
photocopies of the display screens, are available without charge to any 
researcher on the ONR cognitive science mailing list. Other researchers will 
be accommodated but may have to pay reproduction costs if supplies run 
out. The Interlisp software to produce the stimuli and run the 
experiments is available under the same terms. A technical report 
describing the last few studies is being issued simultaneously with this final 
report. Address all inquiries to Alan Lesgold, LRDC, University of Pittsburgh, 
Pittsburgh, PA 15260. 